NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Conducting reproducible research with Umbrella: Tracking, creating, and preserving execution environments

https://doi.org/10.1109/eScience.2016.7870889

Meng, Haiyan; Thain, Douglas; Vyushkov, Alexander; Wolf, Matthias; Woodard, Anna (October 2016, IEEE Conference on e-Science)

Publishing scientific results without the detailed execution environments describing how the results were collected makes it difficult or even impossible for the reader to reproduce the work. However, the configurations of the execution environ- ments are too complex to be described easily by authors. To solve this problem, we propose a framework facilitating the conduct of reproducible research by tracking, creating, and preserving the comprehensive execution environments with Umbrella. The framework includes a lightweight, persistent and deployable execution environment specification, an execution engine which creates the specified execution environments, and an archiver which archives an execution environment into persistent storage services like Amazon S3 and Open Science Framework (OSF). The execution engine utilizes sandbox techniques like virtual machines (VMs), Linux containers and user-space tracers, to cre- ate an execution environment, and allows common dependencies like base OS images to be shared by sandboxes for different applications. We evaluate our framework by utilizing it to reproduce three scientific applications from epidemiology, scene rendering, and high energy physics. We evaluate the time and space overhead of reproducing these applications, and the effectiveness of the chosen archive unit and mounting mechanism for allowing different applications to share dependencies. Our results show that these applications can be reproduced using different sandbox techniques successfully and efficiently, even through the overhead and performance slightly vary.
more » « less
Full Text Available
Scaling up a CMS tier-3 site with campus resources and a 100 Gb/s network connection: what could go wrong?

https://doi.org/10.1088/1742-6596/898/8/082041

Wolf, Matthias; Woodard, Anna; Li, Wenzhao; Anampa, Kenyi Hurtado; Tovar, Benjamin; Brenner, Paul; Lannon, Kevin; Hildreth, Mike; Thain, Douglas (October 2017, Journal of Physics: Conference Series)

Full Text Available
Opportunistic Computing with Lobster: Lessons Learned from Scaling up to 25k Non-Dedicated Cores

https://doi.org/10.1088/1742-6596/898/5/052036

Wolf, Matthias; Woodard, Anna; Li, Wenzhao; Anampa, Kenyi Hurtado; Yannakopoulos, Anna; Tovar, Benjamin; Donnelly, Patrick; Brenner, Paul; Lannon, Kevin; Hildreth, Mike; et al (October 2017, Journal of Physics: Conference Series)

Full Text Available
A new calibration method for charm jet identification validated with proton-proton collision events at √s = 13 TeV

https://doi.org/10.1088/1748-0221/17/03/P03014

Tumasyan, Armen; Adam, Wolfgang; Andrejkovic, Janik Walter; Bergauer, Thomas; Chatterjee, Suman; Dragicevic, Marko; Escalante Del Valle, Alberto; Fruehwirth, Rudolf; Jeitler, Manfred; Krammer, Natascha; et al (March 2022, Journal of Instrumentation)

Abstract Many measurements at the LHC require efficient identification of heavy-flavour jets, i.e. jets originating from bottom (b) or charm (c) quarks. An overview of the algorithms used to identify c jets is described and a novel method to calibrate them is presented. This new method adjusts the entire distributions of the outputs obtained when the algorithms are applied to jets of different flavours. It is based on an iterative approach exploiting three distinct control regions that are enriched with either b jets, c jets, or light-flavour and gluon jets. Results are presented in the form of correction factors evaluated using proton-proton collision data with an integrated luminosity of 41.5 fb -1 at √s = 13 TeV, collected by the CMS experiment in 2017. The closure of the method is tested by applying the measured correction factors on simulated data sets and checking the agreement between the adjusted simulation and collision data. Furthermore, a validation is performed by testing the method on pseudodata, which emulate various mismodelling conditions. The calibrated results enable the use of the full distributions of heavy-flavour identification algorithm outputs, e.g. as inputs to machine-learning models. Thus, they are expected to increase the sensitivity of future physics analyses.
more » « less
Full Text Available

Search for: All records